Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 19877 |
| Missing cells | 7042 |
| Missing cells (%) | 2.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 2.4 MiB |
| Average record size in memory | 128.0 B |
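The derived figures in this table can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is missing cells divided by rows times columns, and that every cell costs 8 bytes (an assumption, though it reproduces the reported 128.0 B per record).

```python
# Values copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 19877
missing_cells = 7042

# Share of missing cells over the full rows x columns grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 8 bytes per cell (a 64-bit number or an object pointer).
bytes_per_record = n_variables * 8
total_mib = n_observations * bytes_per_record / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"record size: {bytes_per_record} B, total: {total_mib:.1f} MiB")
```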
Variable types
| NUM | 10 |
|---|---|
| CAT | 6 |
Warnings
| Warning | Type |
|---|---|
| name has a high cardinality: 19459 distinct values | High cardinality |
| host_name has a high cardinality: 3351 distinct values | High cardinality |
| neighbourhood has a high cardinality: 128 distinct values | High cardinality |
| last_review has a high cardinality: 1345 distinct values | High cardinality |
| last_review has 3513 (17.7%) missing values | Missing |
| reviews_per_month has 3513 (17.7%) missing values | Missing |
| price is highly skewed (γ1 = 38.46497002) | Skewed |
| minimum_nights is highly skewed (γ1 = 34.5447924) | Skewed |
| name is uniformly distributed | Uniform |
| id has unique values | Unique |
| number_of_reviews has 3513 (17.7%) zeros | Zeros |
| availability_365 has 2136 (10.7%) zeros | Zeros |
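The skewness warnings report γ1, the standardized third moment. The sketch below is a minimal illustration that assumes the population form g1 = m3 / m2^1.5; the profiler may use a bias-corrected estimator, so its values need not match this formula exactly. The toy lists are hypothetical and not taken from the dataset.

```python
def sample_skewness(values):
    """Population-form skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy data: a symmetric sample and one with a single huge value.
symmetric = [40, 50, 60, 70, 80]
long_tail = [40, 50, 60, 70, 20199]

print(sample_skewness(symmetric))  # zero for a symmetric sample
print(sample_skewness(long_tail))  # clearly positive: long right tail
```

A single extreme value is enough to push the coefficient far from zero, which is the pattern the price and minimum_nights warnings point to.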
Reproduction
| Analysis started | 2021-03-04 09:20:25.570132 |
|---|---|
| Analysis finished | 2021-03-04 09:20:45.234926 |
| Duration | 19.66 seconds |
| Software version | pandas-profiling v2.9.0 |
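A report of this kind can be regenerated along the following lines. This is a sketch only: it assumes pandas and pandas-profiling are installed, and the file names and report title are hypothetical placeholders, not values taken from this report.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write an HTML report (sketch)."""
    # Imports stay inside the function so the module loads even when
    # pandas or pandas-profiling is not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is a placeholder chosen for this example.
    profile = ProfileReport(df, title="Listings profile")
    profile.to_file(html_path)
    return profile


# Example call with hypothetical file names:
# build_report("listings.csv", "listings_report.html")
```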
id
Real number (ℝ≥0)
| Distinct | 19877 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24448050.07 |
|---|---|
| Minimum | 6499 |
| Maximum | 48142332 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | 6499 |
|---|---|
| 5-th percentile | 2139754.4 |
| Q1 | 14099690 |
| median | 24475545 |
| Q3 | 35592480 |
| 95-th percentile | 45214944.8 |
| Maximum | 48142332 |
| Range | 48135833 |
| Interquartile range (IQR) | 21492790 |
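The last two rows of the quantile table are derived from the others: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A quick check with the values listed for this variable:

```python
# Values copied from the quantile table for this variable.
minimum, q1, q3, maximum = 6499, 14099690, 35592480, 48142332

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50%

print(value_range)  # 48135833
print(iqr)          # 21492790
```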
Descriptive statistics
| Standard deviation | 13365214.48 |
|---|---|
| Coefficient of variation (CV) | 0.5466781376 |
| Kurtosis | -1.04555167 |
| Mean | 24448050.07 |
| Median Absolute Deviation (MAD) | 10735242 |
| Skewness | -0.07734771589 |
| Sum | 4.859538913e+11 |
| Variance | 1.786289581e+14 |
| Monotonicity | Strictly increasing |
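Several descriptive statistics are simple functions of the mean, the standard deviation and the number of observations. The sketch below recomputes them from the rounded values printed in this report, so agreement is only expected to several significant digits.

```python
# Rounded values copied from the report for this variable.
n_observations = 19877
mean = 24448050.07
std = 13365214.48

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
total = mean * n_observations  # sum of all values

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.6e}")
print(f"Sum      = {total:.6e}")
```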
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 36667392 | 1 | < 0.1% | |
| 37285223 | 1 | < 0.1% | |
| 41708658 | 1 | < 0.1% | |
| 43796576 | 1 | < 0.1% | |
| 15084911 | 1 | < 0.1% | |
| 599406 | 1 | < 0.1% | |
| 3690595 | 1 | < 0.1% | |
| 39386862 | 1 | < 0.1% | |
| 8164712 | 1 | < 0.1% | |
| 29794321 | 1 | < 0.1% | |
| Other values (19867) | 19867 | 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 6499 | 1 | < 0.1% | |
| 25659 | 1 | < 0.1% | |
| 29248 | 1 | < 0.1% | |
| 29396 | 1 | < 0.1% | |
| 29915 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 48142332 | 1 | < 0.1% | |
| 48139646 | 1 | < 0.1% | |
| 48136017 | 1 | < 0.1% | |
| 48135943 | 1 | < 0.1% | |
| 48132640 | 1 | < 0.1% |
name
Categorical
| Distinct | 19459 |
|---|---|
| Distinct (%) | 97.9% |
| Missing | 10 |
| Missing (%) | 0.1% |
| Memory size | 155.3 KiB |

| Value | Count |
|---|---|
| BR Guest House | 11 |
| Brand New Hostel in center of Lisbon. | 10 |
| Quinta da Bicuda - Estúdio Bungalow | 9 |
| Amazing 1 Bedroom apartment | 9 |
| Quarto com WC privativo - Arrendamento mensal | 8 |
| Other values (19454) | 19820 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| BR Guest House | 11 | 0.1% | |
| Brand New Hostel in center of Lisbon. | 10 | 0.1% | |
| Quinta da Bicuda - Estúdio Bungalow | 9 | < 0.1% | |
| Amazing 1 Bedroom apartment | 9 | < 0.1% | |
| Quarto com WC privativo - Arrendamento mensal | 8 | < 0.1% | |
| RCGI Homesweet in Lisboa | 7 | < 0.1% | |
| West Coast Surf Hostel | 6 | < 0.1% | |
| One Bedroom Apartment | 6 | < 0.1% | |
| NLC Rooms & Suites,made by Travelers for Travelers | 6 | < 0.1% | |
| Limmo Garden Guest House - Private Suite | 6 | < 0.1% | |
| Other values (19449) | 19789 | 99.6% | |
| (Missing) | 10 | 0.1% |
Frequencies of value counts
Unique
| Unique | 19198 |
|---|---|
| Unique (%) | 96.6% |
Histogram of lengths of the category
Length
| Max length | 132 |
|---|---|
| Median length | 34 |
| Mean length | 34.46324898 |
| Min length | 1 |
host_id
Real number (ℝ≥0)
| Distinct | 8715 |
|---|---|
| Distinct (%) | 43.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 102741377.2 |
|---|---|
| Minimum | 14455 |
| Maximum | 387871064 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | 14455 |
|---|---|
| 5-th percentile | 1756107 |
| Q1 | 15191207 |
| median | 63244098 |
| Q3 | 174887261 |
| 95-th percentile | 312536974 |
| Maximum | 387871064 |
| Range | 387856609 |
| Interquartile range (IQR) | 159696054 |
Descriptive statistics
| Standard deviation | 103107284 |
|---|---|
| Coefficient of variation (CV) | 1.003561436 |
| Kurtosis | -0.2604225337 |
| Mean | 102741377.2 |
| Median Absolute Deviation (MAD) | 57755799 |
| Skewness | 0.9213568162 |
| Sum | 2.042190355e+12 |
| Variance | 1.063111202e+16 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3953109 | 276 | 1.4% | |
| 1756107 | 110 | 0.6% | |
| 104083974 | 100 | 0.5% | |
| 7564916 | 90 | 0.5% | |
| 257927631 | 82 | 0.4% | |
| 1969293 | 80 | 0.4% | |
| 76223539 | 78 | 0.4% | |
| 22192546 | 60 | 0.3% | |
| 5691663 | 59 | 0.3% | |
| 2372087 | 57 | 0.3% | |
| Other values (8705) | 18885 | 95.0% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 14455 | 1 | < 0.1% | |
| 17096 | 1 | < 0.1% | |
| 51461 | 1 | < 0.1% | |
| 68805 | 1 | < 0.1% | |
| 70933 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 387871064 | 1 | < 0.1% | |
| 387506721 | 1 | < 0.1% | |
| 387380530 | 1 | < 0.1% | |
| 387006368 | 1 | < 0.1% | |
| 386842063 | 1 | < 0.1% |
host_name
Categorical
| Distinct | 3351 |
|---|---|
| Distinct (%) | 16.9% |
| Missing | 6 |
| Missing (%) | < 0.1% |
| Memory size | 155.3 KiB |

| Value | Count |
|---|---|
| Maria | 460 |
| Ana | 407 |
| Pedro | 346 |
| Feels Like Home | 276 |
| João | 263 |
| Other values (3346) | 18119 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Maria | 460 | 2.3% | |
| Ana | 407 | 2.0% | |
| Pedro | 346 | 1.7% | |
| Feels Like Home | 276 | 1.4% | |
| João | 263 | 1.3% | |
| Joana | 234 | 1.2% | |
| Luis | 221 | 1.1% | |
| Ricardo | 217 | 1.1% | |
| Nuno | 199 | 1.0% | |
| Sofia | 184 | 0.9% | |
| Other values (3341) | 17064 | 85.8% |
Frequencies of value counts
Unique
| Unique | 1565 |
|---|---|
| Unique (%) | 7.9% |
Histogram of lengths of the category
Length
| Max length | 35 |
|---|---|
| Median length | 6 |
| Mean length | 8.312622629 |
| Min length | 1 |
neighbourhood_group
Categorical
| Distinct | 16 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 155.3 KiB |

| Value | Count |
|---|---|
| Lisboa | 14100 |
| Cascais | 1873 |
| Sintra | 1254 |
| Mafra | 1210 |
| Lourinh | 355 |
| Other values (11) | 1085 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Lisboa | 14100 | 70.9% | |
| Cascais | 1873 | 9.4% | |
| Sintra | 1254 | 6.3% | |
| Mafra | 1210 | 6.1% | |
| Lourinh | 355 | 1.8% | |
| Oeiras | 289 | 1.5% | |
| Torres Vedras | 259 | 1.3% | |
| Loures | 130 | 0.7% | |
| Amadora | 117 | 0.6% | |
| Odivelas | 76 | 0.4% | |
| Other values (6) | 214 | 1.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 21 |
|---|---|
| Median length | 6 |
| Mean length | 6.212506918 |
| Min length | 5 |
neighbourhood
Categorical
| Distinct | 128 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 155.3 KiB |

| Value | Count |
|---|---|
| Santa Maria Maior | 3170 |
| Misericrdia | 2378 |
| Arroios | 1763 |
| Cascais e Estoril | 1304 |
| So Vicente | 1217 |
| Other values (123) | 10045 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Santa Maria Maior | 3170 | 15.9% | |
| Misericrdia | 2378 | 12.0% | |
| Arroios | 1763 | 8.9% | |
| Cascais e Estoril | 1304 | 6.6% | |
| So Vicente | 1217 | 6.1% | |
| Santo Antnio | 1173 | 5.9% | |
| Estrela | 833 | 4.2% | |
| Ericeira | 721 | 3.6% | |
| Avenidas Novas | 645 | 3.2% | |
| S.Maria, S.Miguel, S.Martinho, S.Pedro Penaferrim | 550 | 2.8% | |
| Other values (118) | 6123 | 30.8% |
Frequencies of value counts
Unique
| Unique | 8 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 49 |
|---|---|
| Median length | 12 |
| Mean length | 13.8796096 |
| Min length | 3 |
latitude
Real number (ℝ≥0)
| Distinct | 8428 |
|---|---|
| Distinct (%) | 42.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.75915047 |
|---|---|
| Minimum | 38.67645 |
| Maximum | 39.29767 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | 38.67645 |
|---|---|
| 5-th percentile | 38.69913 |
| Q1 | 38.71094 |
| median | 38.7175 |
| Q3 | 38.73953 |
| 95-th percentile | 38.991854 |
| Maximum | 39.29767 |
| Range | 0.62122 |
| Interquartile range (IQR) | 0.02859 |
Descriptive statistics
| Standard deviation | 0.1092921508 |
|---|---|
| Coefficient of variation (CV) | 0.002819776735 |
| Kurtosis | 8.729100488 |
| Mean | 38.75915047 |
| Median Absolute Deviation (MAD) | 0.00938 |
| Skewness | 2.961007627 |
| Sum | 770415.6338 |
| Variance | 0.01194477422 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 38.71308 | 22 | 0.1% | |
| 38.7385 | 21 | 0.1% | |
| 38.71212 | 21 | 0.1% | |
| 38.73643 | 21 | 0.1% | |
| 38.73046 | 20 | 0.1% | |
| 38.7113 | 20 | 0.1% | |
| 38.72384 | 20 | 0.1% | |
| 38.71161 | 20 | 0.1% | |
| 38.71367 | 20 | 0.1% | |
| 38.71272 | 19 | 0.1% | |
| Other values (8418) | 19673 | 99.0% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 38.67645 | 3 | < 0.1% | |
| 38.67662 | 1 | < 0.1% | |
| 38.6786 | 1 | < 0.1% | |
| 38.67897 | 1 | < 0.1% | |
| 38.67915 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 39.29767 | 1 | < 0.1% | |
| 39.29704 | 1 | < 0.1% | |
| 39.2968 | 1 | < 0.1% | |
| 39.29641 | 1 | < 0.1% | |
| 39.29619 | 1 | < 0.1% |
longitude
Real number (ℝ)
| Distinct | 9891 |
|---|---|
| Distinct (%) | 49.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -9.206918109 |
|---|---|
| Minimum | -9.49852 |
| Maximum | -8.84009 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | -9.49852 |
|---|---|
| 5-th percentile | -9.429526 |
| Q1 | -9.26749 |
| median | -9.14702 |
| Q3 | -9.13485 |
| 95-th percentile | -9.123028 |
| Maximum | -8.84009 |
| Range | 0.65843 |
| Interquartile range (IQR) | 0.13264 |
Descriptive statistics
| Standard deviation | 0.1133625657 |
|---|---|
| Coefficient of variation (CV) | -0.01231275921 |
| Kurtosis | -0.2998830159 |
| Mean | -9.206918109 |
| Median Absolute Deviation (MAD) | 0.01604 |
| Skewness | -1.131796479 |
| Sum | -183005.9113 |
| Variance | 0.0128510713 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -9.13535 | 26 | 0.1% | |
| -9.14246 | 24 | 0.1% | |
| -9.13214 | 20 | 0.1% | |
| -9.13155 | 19 | 0.1% | |
| -9.15084 | 18 | 0.1% | |
| -9.1477 | 17 | 0.1% | |
| -9.13503 | 16 | 0.1% | |
| -9.13508 | 16 | 0.1% | |
| -9.15036 | 16 | 0.1% | |
| -9.13782 | 16 | 0.1% | |
| Other values (9881) | 19689 | 99.1% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -9.49852 | 1 | < 0.1% | |
| -9.48808 | 1 | < 0.1% | |
| -9.48789 | 1 | < 0.1% | |
| -9.4829 | 1 | < 0.1% | |
| -9.48251 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -8.84009 | 1 | < 0.1% | |
| -8.86285 | 1 | < 0.1% | |
| -8.86775 | 1 | < 0.1% | |
| -8.86809 | 1 | < 0.1% | |
| -8.86948 | 1 | < 0.1% |
room_type
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 155.3 KiB |

| Value | Count |
|---|---|
| Entire home/apt | 14725 |
| Private room | 4369 |
| Hotel room | 414 |
| Shared room | 369 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Entire home/apt | 14725 | 74.1% | |
| Private room | 4369 | 22.0% | |
| Hotel room | 414 | 2.1% | |
| Shared room | 369 | 1.9% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 15 |
|---|---|
| Median length | 15 |
| Mean length | 14.16219751 |
| Min length | 10 |
price
Real number (ℝ≥0)
| Distinct | 494 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 95.24812597 |
|---|---|
| Minimum | 0 |
| Maximum | 20199 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 20 |
| Q1 | 40 |
| median | 60 |
| Q3 | 94 |
| 95-th percentile | 230 |
| Maximum | 20199 |
| Range | 20199 |
| Interquartile range (IQR) | 54 |
Descriptive statistics
| Standard deviation | 260.0588293 |
|---|---|
| Coefficient of variation (CV) | 2.730330142 |
| Kurtosis | 2270.250473 |
| Mean | 95.24812597 |
| Median Absolute Deviation (MAD) | 25 |
| Skewness | 38.46497002 |
| Sum | 1893247 |
| Variance | 67630.5947 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 50 | 866 | 4.4% | |
| 60 | 767 | 3.9% | |
| 40 | 614 | 3.1% | |
| 45 | 591 | 3.0% | |
| 30 | 515 | 2.6% | |
| 70 | 510 | 2.6% | |
| 80 | 507 | 2.6% | |
| 65 | 507 | 2.6% | |
| 100 | 489 | 2.5% | |
| 55 | 483 | 2.4% | |
| Other values (484) | 14028 | 70.6% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 1 | < 0.1% | |
| 8 | 3 | < 0.1% | |
| 9 | 37 | 0.2% | |
| 10 | 51 | 0.3% | |
| 11 | 36 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 20199 | 1 | < 0.1% | |
| 9999 | 2 | < 0.1% | |
| 8578 | 1 | < 0.1% | |
| 8125 | 1 | < 0.1% | |
| 8000 | 3 | < 0.1% |
minimum_nights
Real number (ℝ≥0)
| Distinct | 56 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.797504654 |
|---|---|
| Minimum | 1 |
| Maximum | 1000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 7 |
| Maximum | 1000 |
| Range | 999 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 16.30213814 |
|---|---|
| Coefficient of variation (CV) | 4.292855343 |
| Kurtosis | 1731.116252 |
| Mean | 3.797504654 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 34.5447924 |
| Sum | 75483 |
| Variance | 265.759708 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 7027 | 35.4% | |
| 1 | 5333 | 26.8% | |
| 3 | 4395 | 22.1% | |
| 4 | 1003 | 5.0% | |
| 5 | 750 | 3.8% | |
| 7 | 464 | 2.3% | |
| 30 | 243 | 1.2% | |
| 6 | 152 | 0.8% | |
| 15 | 132 | 0.7% | |
| 28 | 78 | 0.4% | |
| Other values (46) | 300 | 1.5% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 5333 | 26.8% | |
| 2 | 7027 | 35.4% | |
| 3 | 4395 | 22.1% | |
| 4 | 1003 | 5.0% | |
| 5 | 750 | 3.8% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1000 | 2 | < 0.1% | |
| 730 | 1 | < 0.1% | |
| 400 | 3 | < 0.1% | |
| 365 | 3 | < 0.1% | |
| 364 | 1 | < 0.1% |
number_of_reviews
Real number (ℝ≥0)
| Distinct | 429 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 42.8716104 |
|---|---|
| Minimum | 0 |
| Maximum | 802 |
| Zeros | 3513 |
| Zeros (%) | 17.7% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 13 |
| Q3 | 55 |
| 95-th percentile | 187 |
| Maximum | 802 |
| Range | 802 |
| Interquartile range (IQR) | 53 |
Descriptive statistics
| Standard deviation | 67.66015947 |
|---|---|
| Coefficient of variation (CV) | 1.57820429 |
| Kurtosis | 9.628760081 |
| Mean | 42.8716104 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 2.671232755 |
| Sum | 852159 |
| Variance | 4577.89718 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 3513 | 17.7% | |
| 1 | 1348 | 6.8% | |
| 2 | 939 | 4.7% | |
| 3 | 710 | 3.6% | |
| 4 | 581 | 2.9% | |
| 5 | 482 | 2.4% | |
| 6 | 406 | 2.0% | |
| 7 | 378 | 1.9% | |
| 8 | 350 | 1.8% | |
| 9 | 305 | 1.5% | |
| Other values (419) | 10865 | 54.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 3513 | 17.7% | |
| 1 | 1348 | 6.8% | |
| 2 | 939 | 4.7% | |
| 3 | 710 | 3.6% | |
| 4 | 581 | 2.9% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 802 | 1 | < 0.1% | |
| 727 | 1 | < 0.1% | |
| 695 | 1 | < 0.1% | |
| 681 | 1 | < 0.1% | |
| 626 | 1 | < 0.1% |
last_review
Categorical
| Distinct | 1345 |
|---|---|
| Distinct (%) | 8.2% |
| Missing | 3513 |
| Missing (%) | 17.7% |
| Memory size | 155.3 KiB |

| Value | Count |
|---|---|
| 2021-01-02 | 247 |
| 2021-01-03 | 160 |
| 2021-01-01 | 158 |
| 2020-03-15 | 151 |
| 2020-01-02 | 146 |
| Other values (1340) | 15502 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2021-01-02 | 247 | 1.2% | |
| 2021-01-03 | 160 | 0.8% | |
| 2021-01-01 | 158 | 0.8% | |
| 2020-03-15 | 151 | 0.8% | |
| 2020-01-02 | 146 | 0.7% | |
| 2021-01-31 | 134 | 0.7% | |
| 2020-01-01 | 131 | 0.7% | |
| 2020-03-16 | 123 | 0.6% | |
| 2020-01-03 | 104 | 0.5% | |
| 2020-03-13 | 102 | 0.5% | |
| Other values (1335) | 14908 | 75.0% | |
| (Missing) | 3513 | 17.7% |
Frequencies of value counts
Unique
| Unique | 397 |
|---|---|
| Unique (%) | 2.4% |
Histogram of lengths of the category
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 8.762841475 |
| Min length | 3 |
reviews_per_month
Real number (ℝ≥0)
| Distinct | 604 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 3513 |
| Missing (%) | 17.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.159753728 |
|---|---|
| Minimum | 0.01 |
| Maximum | 44.75 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 0.05 |
| Q1 | 0.24 |
| median | 0.72 |
| Q3 | 1.73 |
| 95-th percentile | 3.62 |
| Maximum | 44.75 |
| Range | 44.74 |
| Interquartile range (IQR) | 1.49 |
Descriptive statistics
| Standard deviation | 1.250472079 |
|---|---|
| Coefficient of variation (CV) | 1.078222082 |
| Kurtosis | 93.84222368 |
| Mean | 1.159753728 |
| Median Absolute Deviation (MAD) | 0.57 |
| Skewness | 4.101096497 |
| Sum | 18978.21 |
| Variance | 1.56368042 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0.05 | 327 | 1.6% | |
| 0.06 | 280 | 1.4% | |
| 0.09 | 254 | 1.3% | |
| 0.07 | 245 | 1.2% | |
| 0.16 | 244 | 1.2% | |
| 0.11 | 204 | 1.0% | |
| 0.03 | 203 | 1.0% | |
| 0.17 | 197 | 1.0% | |
| 0.1 | 178 | 0.9% | |
| 0.13 | 174 | 0.9% | |
| Other values (594) | 14058 | 70.7% | |
| (Missing) | 3513 | 17.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0.01 | 29 | 0.1% | |
| 0.02 | 123 | 0.6% | |
| 0.03 | 203 | 1.0% | |
| 0.04 | 166 | 0.8% | |
| 0.05 | 327 | 1.6% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 44.75 | 1 | < 0.1% | |
| 15.55 | 1 | < 0.1% | |
| 12.49 | 1 | < 0.1% | |
| 11.42 | 1 | < 0.1% | |
| 9.85 | 1 | < 0.1% |
calculated_host_listings_count
Real number (ℝ≥0)
| Distinct | 50 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.23122202 |
|---|---|
| Minimum | 1 |
| Maximum | 276 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 3 |
| Q3 | 10 |
| 95-th percentile | 48 |
| Maximum | 276 |
| Range | 275 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 35.47728739 |
|---|---|
| Coefficient of variation (CV) | 2.681331123 |
| Kurtosis | 39.58234138 |
| Mean | 13.23122202 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 5.96889533 |
| Sum | 262997 |
| Variance | 1258.63792 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 5766 | 29.0% | |
| 2 | 2482 | 12.5% | |
| 3 | 1701 | 8.6% | |
| 4 | 1308 | 6.6% | |
| 5 | 995 | 5.0% | |
| 6 | 912 | 4.6% | |
| 7 | 602 | 3.0% | |
| 8 | 504 | 2.5% | |
| 10 | 470 | 2.4% | |
| 9 | 441 | 2.2% | |
| Other values (40) | 4696 | 23.6% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 5766 | 29.0% | |
| 2 | 2482 | 12.5% | |
| 3 | 1701 | 8.6% | |
| 4 | 1308 | 6.6% | |
| 5 | 995 | 5.0% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 276 | 276 | 1.4% | |
| 110 | 110 | 0.6% | |
| 100 | 100 | 0.5% | |
| 90 | 90 | 0.5% | |
| 82 | 82 | 0.4% |
availability_365
Real number (ℝ≥0)
| Distinct | 366 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 233.7686774 |
|---|---|
| Minimum | 0 |
| Maximum | 365 |
| Zeros | 2136 |
| Zeros (%) | 10.7% |
| Memory size | 155.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 122 |
| median | 278 |
| Q3 | 361 |
| 95-th percentile | 365 |
| Maximum | 365 |
| Range | 365 |
| Interquartile range (IQR) | 239 |
Descriptive statistics
| Standard deviation | 133.0285016 |
|---|---|
| Coefficient of variation (CV) | 0.5690604193 |
| Kurtosis | -1.132030803 |
| Mean | 233.7686774 |
| Median Absolute Deviation (MAD) | 87 |
| Skewness | -0.6048257352 |
| Sum | 4646620 |
| Variance | 17696.58223 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 365 | 2701 | 13.6% | |
| 0 | 2136 | 10.7% | |
| 364 | 1498 | 7.5% | |
| 180 | 477 | 2.4% | |
| 363 | 434 | 2.2% | |
| 179 | 386 | 1.9% | |
| 45 | 314 | 1.6% | |
| 320 | 281 | 1.4% | |
| 362 | 267 | 1.3% | |
| 90 | 266 | 1.3% | |
| Other values (356) | 11117 | 55.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 2136 | 10.7% | |
| 1 | 152 | 0.8% | |
| 2 | 20 | 0.1% | |
| 3 | 17 | 0.1% | |
| 4 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 365 | 2701 | 13.6% | |
| 364 | 1498 | 7.5% | |
| 363 | 434 | 2.2% | |
| 362 | 267 | 1.3% | |
| 361 | 158 | 0.8% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
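The definition of r can be written out directly. The sketch below is a plain-Python illustration of covariance divided by the product of standard deviations; the sample data is hypothetical and the profiler itself may rely on a library routine instead.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of covariance and standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical data: y is an exact linear function of x.
x = [1, 2, 3, 4, 5]
y = [2 * v + 1 for v in x]

print(pearson_r(x, y))                    # close to +1
print(pearson_r(x, [-v for v in y]))      # close to -1
print(pearson_r([10 * v for v in x], y))  # rescaling x leaves r close to +1
```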
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
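In other words, ρ is Pearson's r applied to ranks. The sketch below assumes tied values receive the average of their rank positions, a common convention that this report does not confirm. The data is hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (spread_x * spread_y)


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
print(spearman_rho(x, y))               # close to +1
```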
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
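The pair-counting definition translates directly into code. The sketch below implements the simple form described here (often called tau-a), in which tied pairs count as neither concordant nor discordant; statistical libraries frequently use a tie-corrected variant, so results can differ when ties are present. The data is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

print(kendall_tau_a(x, y))  # (5 - 1) / 6
print(kendall_tau_a(x, x))  # 1.0
```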
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6499 | Belém 1 Bedroom Historical Apartment | 14455 | Bruno | Lisboa | Belm | 38.69750 | -9.19768 | Entire home/apt | 40 | 3 | 27 | 2021-01-26 | 0.34 | 1 | 341 |
| 1 | 25659 | Heart of Alfama - Lisbon Center | 107347 | Ellie | Lisboa | Santa Maria Maior | 38.71167 | -9.12696 | Entire home/apt | 30 | 10 | 113 | 2019-12-08 | 1.36 | 1 | 108 |
| 2 | 29248 | Apartamento Alfama com vista para o rio! | 125768 | Bárbara | Lisboa | Santa Maria Maior | 38.71272 | -9.12628 | Entire home/apt | 38 | 3 | 325 | 2021-01-10 | 2.64 | 1 | 303 |
| 3 | 29396 | Alfama Hill - Boutique apartment | 126415 | Mónica | Lisboa | Santa Maria Maior | 38.71156 | -9.12987 | Entire home/apt | 25 | 2 | 265 | 2021-01-22 | 2.49 | 2 | 323 |
| 4 | 29915 | Modern and Cool Apartment in Lisboa | 128890 | Sara | Lisboa | Avenidas Novas | 38.74712 | -9.15286 | Entire home/apt | 48 | 5 | 40 | 2021-01-24 | 0.31 | 1 | 294 |
| 5 | 33348 | Happy Season | 144484 | Bruno | Lisboa | Lumiar | 38.76381 | -9.15256 | Private room | 40 | 1 | 2 | 2011-07-22 | 0.02 | 2 | 0 |
| 6 | 42519 | Nice Apart.BAIRRO ALTO (ADAMASTOR) 6-1º | 136230 | David | Lisboa | Misericrdia | 38.70896 | -9.14938 | Entire home/apt | 50 | 1 | 114 | 2020-03-08 | 1.00 | 11 | 258 |
| 7 | 48025 | Apartment for renting in Lisbon | 218778 | José | Lisboa | Misericrdia | 38.71309 | -9.14392 | Entire home/apt | 65 | 5 | 18 | 2020-09-10 | 0.15 | 5 | 365 |
| 8 | 48058 | Small House Downtown Cascais | 218990 | Pim | Cascais | Cascais e Estoril | 38.69650 | -9.42571 | Entire home/apt | 80 | 5 | 33 | 2020-07-23 | 0.46 | 1 | 218 |
| 9 | 48854 | Comfortable 4BR in historical villa, near market | 222551 | Dagmar | Sintra | S.Maria, S.Miguel, S.Martinho, S.Pedro Penaferrim | 38.80380 | -9.37970 | Entire home/apt | 161 | 5 | 45 | 2020-12-23 | 0.67 | 1 | 0 |
Last rows
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19867 | 48085251 | GuestReady - Cozy Black Bricked Flat in Vibrant Graça Quarters | 341831718 | Joao | Lisboa | Arroios | 38.72597 | -9.13108 | Entire home/apt | 36 | 1 | 0 | NaN | NaN | 1 | 365 |
| 19868 | 48099900 | Lovely 2 Bedroom Duplex Apt w/Terrace in Cascais | 387871064 | Márcia | Cascais | Alcabideche | 38.71688 | -9.42668 | Entire home/apt | 64 | 2 | 0 | NaN | NaN | 1 | 178 |
| 19869 | 48100008 | Lisboa Lovely Apartment Bairro Alto | 10058743 | Rentportugal | Lisboa | Misericrdia | 38.71195 | -9.14374 | Entire home/apt | 20 | 2 | 1 | 2021-02-12 | 1.0 | 12 | 269 |
| 19870 | 48121569 | Casa centro cascais | 1827516 | Karla Lamounier | Cascais | Cascais e Estoril | 38.69895 | -9.42395 | Entire home/apt | 392 | 1 | 0 | NaN | NaN | 3 | 73 |
| 19871 | 48130409 | Moradia T3 nos Jardins da Parede \| Jardim | 19254511 | Goncalo | Cascais | Cascais e Estoril | 38.70380 | -9.39386 | Entire home/apt | 90 | 30 | 0 | NaN | NaN | 2 | 364 |
| 19872 | 48132640 | Sky Room | 27809636 | Miguel | Mafra | Ericeira | 38.96054 | -9.41491 | Private room | 80 | 1 | 0 | NaN | NaN | 5 | 365 |
| 19873 | 48135943 | Hera 1 By Innkeeper | 203773123 | Innkeeper | Lisboa | Misericrdia | 38.71197 | -9.14780 | Entire home/apt | 33 | 2 | 0 | NaN | NaN | 31 | 359 |
| 19874 | 48136017 | Hera 2 By Innkeeper | 203773123 | Innkeeper | Lisboa | Misericrdia | 38.71038 | -9.14749 | Entire home/apt | 33 | 2 | 0 | NaN | NaN | 31 | 365 |
| 19875 | 48139646 | Dom Durão Village House, no sopé de Montejunto | 335615367 | Carlota | Cadaval | Lamas e Cercal | 39.22942 | -9.07565 | Entire home/apt | 32 | 2 | 0 | NaN | NaN | 2 | 364 |
| 19876 | 48142332 | Guincho Countryside bedroom with private bathroom | 38921144 | Pedro | Cascais | Alcabideche | 38.73939 | -9.43694 | Private room | 24 | 3 | 0 | NaN | NaN | 2 | 364 |